We will clean the environment, setup the locations, define colors, and create a datestamp.
Clean the environment.
Set locations and working directories…
Create a new analysis directory...
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
[1] "/Users/swvanderlaan/PLINK/analyses/consortia/CHARGE_1000G_CAC"
[1] "_archived" "1. CHARGE_1000G_CAC.nb.html" "1. CHARGE_1000G_CAC.Rmd"
[4] "2. bulkRNAseq.nb.html" "2. bulkRNAseq.Rmd" "20220128.CAC.RegionalAssociationPlots.RData"
[7] "20220129.CAC.Parsing_GWASSumStats.RData" "20220131.CAC.PolarMorphism.RData" "20220201.CAC.PolarMorphism.RData"
[10] "20220202.CAC.PolarMorphism.RData" "3. scRNAseq.nb.html" "3. scRNAseq.Rmd"
[13] "4. Parsing_GWASSumStats.nb.html" "4. Parsing_GWASSumStats.Rmd" "5. RegionalAssociationPlots.nb.html"
[16] "5. RegionalAssociationPlots.Rmd" "6. PolarMorphism.nb.html" "6. PolarMorphism.Rmd"
[19] "CAC" "CHARGE_1000G_CAC.Rproj" "CredibleSets"
[22] "images" "LICENSE" "PolarMorphism"
[25] "RACER" "README.html" "README.md"
[28] "README.orig.md" "renv" "renv.lock"
[31] "scripts" "SNP" "targets"
… a package-installation function …
source(paste0(PROJECT_loc, "/scripts/functions.R"))… and load those packages.
install.packages.auto("readr")Loading required package: readr
install.packages.auto("optparse")Loading required package: optparse
install.packages.auto("tools")Loading required package: tools
install.packages.auto("dplyr")Loading required package: dplyr
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
install.packages.auto("tidyr")Loading required package: tidyr
install.packages.auto("naniar")Loading required package: naniar
# To get 'data.table' with 'fwrite' to be able to directly write gzipped-files
# Ref: https://stackoverflow.com/questions/42788401/is-possible-to-use-fwrite-from-data-table-with-gzfile
# install.packages("data.table", repos = "https://Rdatatable.gitlab.io/data.table")
library(data.table)Registered S3 method overwritten by 'data.table':
method from
print.data.table
data.table 1.14.2 using 1 threads (see ?getDTthreads). Latest news: r-datatable.com
**********
This installation of data.table has not detected OpenMP support. It should still work but in single-threaded mode.
This is a Mac. Please read https://mac.r-project.org/openmp/. Please engage with Apple and ask them for support. Check r-datatable.com for updates, and our Mac instructions here: https://github.com/Rdatatable/data.table/wiki/Installation. After several years of many reports of installation problems on Mac, it's time to gingerly point out that there have been no similar problems on Windows or Linux.
**********
Attaching package: ‘data.table’
The following objects are masked from ‘package:dplyr’:
between, first, last
install.packages.auto("tidyverse")Loading required package: tidyverse
Registered S3 methods overwritten by 'dbplyr':
method from
print.tbl_lazy
print.tbl_sql
── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.1 ──
✓ ggplot2 3.3.5 ✓ stringr 1.4.0
✓ tibble 3.1.6 ✓ forcats 0.5.1
✓ purrr 0.3.4
── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x data.table::between() masks dplyr::between()
x dplyr::filter() masks stats::filter()
x data.table::first() masks dplyr::first()
x dplyr::lag() masks stats::lag()
x data.table::last() masks dplyr::last()
x purrr::transpose() masks data.table::transpose()
install.packages.auto("knitr")Loading required package: knitr
install.packages.auto("DT")Loading required package: DT
install.packages.auto("eeptools")Loading required package: eeptools
Welcome to eeptools for R version 1.2.0!
Developed by Jared E. Knowles 2012-2018
for the Wisconsin Department of Public Instruction
Distributed without warranty.
install.packages.auto("haven")Loading required package: haven
install.packages.auto("tableone")Loading required package: tableone
install.packages.auto("BlandAltmanLeh")Loading required package: BlandAltmanLeh
# Install the devtools package from Hadley Wickham
install.packages.auto('devtools')Loading required package: devtools
Loading required package: usethis
library(devtools)
# for plotting
install.packages.auto("pheatmap")Loading required package: pheatmap
install.packages.auto("forestplot")Loading required package: forestplot
Loading required package: grid
Loading required package: magrittr
Attaching package: ‘magrittr’
The following object is masked from ‘package:purrr’:
set_names
The following object is masked from ‘package:tidyr’:
extract
Loading required package: checkmate
install.packages.auto("ggplot2")
install.packages.auto("ggpubr")Loading required package: ggpubr
install.packages.auto("ggrepel")Loading required package: ggrepel
install.packages.auto("UpSetR")Loading required package: UpSetR
devtools::install_github("thomasp85/patchwork")Using github PAT from envvar GITHUB_PAT
Skipping install of 'patchwork' from a github remote, the SHA1 (79223d30) has not changed since last install.
Use `force = TRUE` to force installation
# For regional association plots
install_github("oliviasabik/RACER") Using github PAT from envvar GITHUB_PAT
Skipping install of 'RACER' from a github remote, the SHA1 (1394c9d4) has not changed since last install.
Use `force = TRUE` to force installation
# Install ggrepel package if needed
library(ggrepel)
# install ggsci
install.packages.auto("ggsci")Loading required package: ggsci
# plotly
# install.packages.auto("plotly")We will create a datestamp and define the Utrecht Science Park Colour Scheme.
Today = format(as.Date(as.POSIXlt(Sys.time())), "%Y%m%d")
Today.Report = format(as.Date(as.POSIXlt(Sys.time())), "%A, %B %d, %Y")
### UtrechtScienceParkColoursScheme
###
### WebsitetoconvertHEXtoRGB:http://hex.colorrrs.com.
### Forsomefunctionsyoushoulddividethesenumbersby255.
###
### No. Color HEX (RGB) CHR MAF/INFO
###---------------------------------------------------------------------------------------
### 1 yellow #FBB820 (251,184,32) => 1 or 1.0>INFO
### 2 gold #F59D10 (245,157,16) => 2
### 3 salmon #E55738 (229,87,56) => 3 or 0.05<MAF<0.2 or 0.4<INFO<0.6
### 4 darkpink #DB003F ((219,0,63) => 4
### 5 lightpink #E35493 (227,84,147) => 5 or 0.8<INFO<1.0
### 6 pink #D5267B (213,38,123) => 6
### 7 hardpink #CC0071 (204,0,113) => 7
### 8 lightpurple #A8448A (168,68,138) => 8
### 9 purple #9A3480 (154,52,128) => 9
### 10 lavendel #8D5B9A (141,91,154) => 10
### 11 bluepurple #705296 (112,82,150) => 11
### 12 purpleblue #686AA9 (104,106,169) => 12
### 13 lightpurpleblue #6173AD (97,115,173/101,120,180) => 13
### 14 seablue #4C81BF (76,129,191) => 14
### 15 skyblue #2F8BC9 (47,139,201) => 15
### 16 azurblue #1290D9 (18,144,217) => 16 or 0.01<MAF<0.05 or 0.2<INFO<0.4
### 17 lightazurblue #1396D8 (19,150,216) => 17
### 18 greenblue #15A6C1 (21,166,193) => 18
### 19 seaweedgreen #5EB17F (94,177,127) => 19
### 20 yellowgreen #86B833 (134,184,51) => 20
### 21 lightmossgreen #C5D220 (197,210,32) => 21
### 22 mossgreen #9FC228 (159,194,40) => 22 or MAF>0.20 or 0.6<INFO<0.8
### 23 lightgreen #78B113 (120,177,19) => 23/X
### 24 green #49A01D (73,160,29) => 24/Y
### 25 grey #595A5C (89,90,92) => 25/XY or MAF<0.01 or 0.0<INFO<0.2
### 26 lightgrey #A2A3A4 (162,163,164) => 26/MT
###
### ADDITIONAL COLORS
### 27 midgrey #D7D8D7
### 28 verylightgrey #ECECEC"
### 29 white #FFFFFF
### 30 black #000000
###----------------------------------------------------------------------------------------------
uithof_color = c("#FBB820","#F59D10","#E55738","#DB003F","#E35493","#D5267B",
"#CC0071","#A8448A","#9A3480","#8D5B9A","#705296","#686AA9",
"#6173AD","#4C81BF","#2F8BC9","#1290D9","#1396D8","#15A6C1",
"#5EB17F","#86B833","#C5D220","#9FC228","#78B113","#49A01D",
"#595A5C","#A2A3A4", "#D7D8D7", "#ECECEC", "#FFFFFF", "#000000")
uithof_color_legend = c("#FBB820", "#F59D10", "#E55738", "#DB003F", "#E35493",
"#D5267B", "#CC0071", "#A8448A", "#9A3480", "#8D5B9A",
"#705296", "#686AA9", "#6173AD", "#4C81BF", "#2F8BC9",
"#1290D9", "#1396D8", "#15A6C1", "#5EB17F", "#86B833",
"#C5D220", "#9FC228", "#78B113", "#49A01D", "#595A5C",
"#A2A3A4", "#D7D8D7", "#ECECEC", "#FFFFFF", "#000000")
### ----------------------------------------------------------------------------We will parse the data to create regional association plots for each of the 11 loci.
library("scales")
Attaching package: ‘scales’
The following object is masked from ‘package:purrr’:
discard
The following object is masked from ‘package:readr’:
col_factor
pal_npg("nrc")(10) [1] "#E64B35FF" "#4DBBD5FF" "#00A087FF" "#3C5488FF" "#F39B7FFF" "#8491B4FF" "#91D1C2FF" "#DC0000FF" "#7E6148FF" "#B09C85FF"
show_col(pal_npg("nrc")(10))
# show_col(pal_npg("nrc", alpha = 0.6)(10))We need to load the data first.
gwas_sumstats_racer <- readRDS(file = paste0(OUT_loc, "/gwas_sumstats_racer.rds"))We are interested in 11 top loci.
library(openxlsx)
variant_list <- read.xlsx(paste0(TARGET_loc, "/Variants.xlsx"), sheet = "TopLoci")
DT::datatable(variant_list)NALet’s do some plotting.
library(RACER)
# Make directory for plots
ifelse(!dir.exists(file.path(PROJECT_loc, "/RACER")),
dir.create(file.path(PROJECT_loc, "/RACER")),
FALSE)[1] FALSE
RACER_loc = paste0(PROJECT_loc,"/RACER")
variants_of_interest <- c(variant_list$rsID)
variants_of_interest_fewgenes <- c("rs9349379", "rs3844006", "rs2854746", "rs4977575", "rs9633535", "rs11063120", "rs9515203", "rs7182103")
for(VARIANT in variants_of_interest){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
# source(paste0(PROJECT_loc, "/scripts/functions.R"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 2, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}Getting data for rs9349379.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9349379...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9349379&pop=EUR&r2_d=r2&token=c0f613f149ab"
[45%] Downloaded 32210 bytes...
[100%] Downloaded 70967 bytes...
Merging input association data with LD...
Plotting region surrounding rs9349379 on 6:12403957-13403957.
Plotting by...
snp rs9349379
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs9349379.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs3844006.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs3844006...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs3844006&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 31016 bytes...
Merging input association data with LD...
Plotting region surrounding rs3844006 on 6:131595002-132595002.
Plotting by...
snp rs3844006
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs3844006.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs2854746.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs2854746...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs2854746&pop=EUR&r2_d=r2&token=c0f613f149ab"
[21%] Downloaded 23495 bytes...
[52%] Downloaded 57344 bytes...
[82%] Downloaded 90112 bytes...
[100%] Downloaded 109298 bytes...
Merging input association data with LD...
Plotting region surrounding rs2854746 on 7:45460645-46460645.
Plotting by...
snp rs2854746
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs2854746.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs4977575.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs4977575...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs4977575&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 26000 bytes...
Merging input association data with LD...
Plotting region surrounding rs4977575 on 9:21624744-22624744.
Plotting by...
snp rs4977575
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs4977575.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs10899970.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10899970...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10899970&pop=EUR&r2_d=r2&token=c0f613f149ab"
[34%] Downloaded 55495 bytes...
[35%] Downloaded 57344 bytes...
[45%] Downloaded 73710 bytes...
[56%] Downloaded 90112 bytes...
[86%] Downloaded 139246 bytes...
[96%] Downloaded 155648 bytes...
[100%] Downloaded 160476 bytes...
Merging input association data with LD...
Plotting region surrounding rs10899970 on 10:44015716-45334720.
Plotting by...
snp rs10899970
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs10899970.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs9633535.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9633535...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9633535&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 49363 bytes...
Merging input association data with LD...
Plotting region surrounding rs9633535 on 10:63336088-64336088.
Plotting by...
snp rs9633535
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs9633535.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs10762577.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10762577...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10762577&pop=EUR&r2_d=r2&token=c0f613f149ab"
[70%] Downloaded 65518 bytes...
[88%] Downloaded 81920 bytes...
[100%] Downloaded 92997 bytes...
Merging input association data with LD...
Plotting region surrounding rs10762577 on 10:75417431-76417431.
Plotting by...
snp rs10762577
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs10762577.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs11063120.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs11063120...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs11063120&pop=EUR&r2_d=r2&token=c0f613f149ab"
[38%] Downloaded 15980 bytes...
[100%] Downloaded 41883 bytes...
Merging input association data with LD...
Plotting region surrounding rs11063120 on 12:3986618-4986618.
Plotting by...
snp rs11063120
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs11063120.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs9515203.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs9515203...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs9515203&pop=EUR&r2_d=r2&token=c0f613f149ab"
[53%] Downloaded 15980 bytes...
[100%] Downloaded 30139 bytes...
Merging input association data with LD...
Plotting region surrounding rs9515203 on 13:110549623-111549623.
Plotting by...
snp rs9515203
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs9515203.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs7182103.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7182103...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7182103&pop=EUR&r2_d=r2&token=c0f613f149ab"
[19%] Downloaded 23495 bytes...
[55%] Downloaded 65518 bytes...
[83%] Downloaded 98286 bytes...
[100%] Downloaded 118299 bytes...
Merging input association data with LD...
Plotting region surrounding rs7182103 on 15:78623946-79623946.
Plotting by...
snp rs7182103
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs7182103.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs7412.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7412...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7412&pop=EUR&r2_d=r2&token=c0f613f149ab"
[100%] Downloaded 38865 bytes...
Merging input association data with LD...
Plotting region surrounding rs7412 on 19:44912079-45912079.
Plotting by...
snp rs7412
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs7412.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
variants_of_interest_manygenes <- c("rs7412", "rs10762577")
source(paste0(PROJECT_loc, "/scripts/functions.R"))
for(VARIANT in variants_of_interest_manygenes){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 0.75, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}Getting data for rs7412.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs7412...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs7412&pop=EUR&r2_d=r2&token=c0f613f149ab"
[19%] Downloaded 7634 bytes...
[100%] Downloaded 38865 bytes...
Merging input association data with LD...
Plotting region surrounding rs7412 on 19:44912079-45912079.
Plotting by...
snp rs7412
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs7412.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs10762577.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10762577...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10762577&pop=EUR&r2_d=r2&token=c0f613f149ab"
[34%] Downloaded 32210 bytes...
[77%] Downloaded 71862 bytes...
[100%] Downloaded 92997 bytes...
Merging input association data with LD...
Plotting region surrounding rs10762577 on 10:75417431-76417431.
Plotting by...
snp rs10762577
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs10762577.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
variants_of_interest_cxcl12 <- c("rs10899970")
source(paste0(PROJECT_loc, "/scripts/functions.R"))
for(VARIANT in variants_of_interest_cxcl12){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(variant_list, rsID == VARIANT)[,5]
tempSTART <- subset(variant_list, rsID == VARIANT)[,17]
tempEND <- subset(variant_list, rsID == VARIANT)[,18]
tempVARIANTnr <- subset(variant_list, rsID == VARIANT)[,1]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f_ld,
chr = tempCHR, build = "hg19", set = "all",
plotby = "snp", snp_plot = VARIANT,
label_lead = TRUE, gene_track_h = 0.75, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.png"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.pdf"), plot = last_plot())
ggsave(filename = paste0(RACER_loc, "/", tempVARIANTnr, ".", Today, ".",VARIANT,".regional_assoc.eps"), plot = last_plot())
rm(temp, p1,
temp_f, temp_f_ld,
tempCHR, tempSTART, tempEND,
VARIANT, tempVARIANTnr)
}Getting data for rs10899970.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
All inputs are go!
Reading in association data...
Populations selected.
Calculating LD using rs10899970...
[1] "https://ldlink.nci.nih.gov/LDlinkRest/ldproxy?var=rs10899970&pop=EUR&r2_d=r2&token=c0f613f149ab"
[45%] Downloaded 73710 bytes...
[66%] Downloaded 106478 bytes...
[86%] Downloaded 139246 bytes...
[100%] Downloaded 160476 bytes...
Merging input association data with LD...
Plotting region surrounding rs10899970 on 10:44015716-45334720.
Plotting by...
snp rs10899970
Reading in association data
Determining lead SNP
Generating Plot
Saving image for rs10899970.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
We want to create some regional association plots to combine with teh UCSC browser tracks, thus we need the exact same regions.
library(openxlsx)
add_list <- read.xlsx(paste0(TARGET_loc, "/Variants.xlsx"), sheet = "AdditionalPlots")
DT::datatable(add_list)NAWe want to color the credible sets, which we load here.
credset <- as_tibble(fread(paste0(PROJECT_loc, "/CredibleSets/CAC_EUR_AFR_cred_set_all_loci_50kb.txt")))
credsetWe want to add the posterior probabilities and make a variable to color by.
gwas_sumstats_racer_credset <- merge(gwas_sumstats_racer,
credset %>% select(RSID, Posterior_Prob),
sort = FALSE,
by.x = "rsID", by.y = "RSID", all.x = TRUE) %>%
# mutate(., Posterior_Prob = ifelse(is.na(Posterior_Prob), 0, Posterior_Prob)) %>%
mutate(CredSet = case_when(Posterior_Prob > 0 ~ '95% credible set',
TRUE ~ 'not in credible set'))
head(gwas_sumstats_racer_credset)
table(gwas_sumstats_racer_credset$CredSet)
95% credible set not in credible set
103 8585944
summary(gwas_sumstats_racer_credset$Posterior_Prob) Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0 0 0 0 0 1 8585944
library(RACER)
# library(plotly)
# Make directory for plots
ifelse(!dir.exists(file.path(PROJECT_loc, "/RACER")),
dir.create(file.path(PROJECT_loc, "/RACER")),
FALSE)[1] FALSE
RACER_loc = paste0(PROJECT_loc,"/RACER")
variants_of_interest <- c(add_list$rsID)
for(VARIANT in variants_of_interest){
cat(paste0("Getting data for ", VARIANT,".\n"))
tempCHR <- subset(add_list, rsID == VARIANT)[,4]
tempSTART <- subset(add_list, rsID == VARIANT)[,5]
tempEND <- subset(add_list, rsID == VARIANT)[,6]
tempNAME <- subset(add_list, rsID == VARIANT)[,3]
cat("\nSubset required data.\n")
temp <- subset(gwas_sumstats_racer_credset, Chr == tempCHR & (Position >= tempSTART & Position <= tempEND))
cat("\nFormatting association data.\n")
temp_f = RACER::formatRACER(assoc_data = temp, chr_col = 3, pos_col = 4, p_col = 5)
cat("\nGetting LD data.\n")
# temp_f_ld = RACER::ldRACER(assoc_data = temp_f, rs_col = 2, pops = "EUR", lead_snp = VARIANT)
cat(paste0("\nPlotting region surrounding ", VARIANT," on ",tempCHR,":",tempSTART,"-",tempEND,".\n"))
p1 <- singlePlotRACER2(assoc_data = temp_f,
chr = tempCHR, build = "hg19",
plotby = "coord", snp_plot = VARIANT,
start_plot = tempSTART, end_plot = tempEND,
label_lead = FALSE,
grey_colors = FALSE,
cred_set = TRUE,
gene_track_h = 3, gene_name_s = 1.75)
print(p1)
cat(paste0("Saving image for ", VARIANT,".\n"))
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.png"), plot = p1)
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.pdf"), plot = p1)
ggsave(filename = paste0(RACER_loc, "/", tempNAME, ".", Today, ".",VARIANT,".",tempSTART,".",tempEND,".regional_assoc.eps"), plot = p1)
# print(ggplotly(p1))
rm(temp, p1,
temp_f,
tempCHR, tempSTART, tempEND,
VARIANT, tempNAME)
}Getting data for rs9633535.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs9633535 on 10:63584853-63921073.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs9633535.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs2854746.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs2854746 on 7:45894617-46054070.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs2854746.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Getting data for rs3844006.
Subset required data.
Formatting association data.
Formating association data...
Processing -log10(p-values)...
Preparing association data...
Getting LD data.
Plotting region surrounding rs3844006 on 6:131937915-132289374.
Association Data Set is missing LD data, the resulting plot won't have LD information, but you can add it using the ldRACER.R function.
Plotting by...
coord
Reading in association data
Collecting posterior probabilities
Generating Plot
Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale.
Saving image for rs3844006.
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Saving 7.29 x 4.51 in image
Version: v1.3.0
Last update: 2022-02-03
Written by: Sander W. van der Laan (s.w.vanderlaan-2[at]umcutrecht.nl).
Description: Script to create plot regional association plots.
Minimum requirements: R version 3.4.3 (2017-06-30) -- 'Single Candle', Mac OS X El Capitan
Changes log
* v1.3.0 Added the credible sets to the aditional regions.
* v1.2.0 Added in aditional regions.
* v1.1.0 Created PNG and PDF of top loci regions.
* v1.0.0 Initial version.
sessionInfo()R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] grid tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] RACER_1.0.0 openxlsx_4.2.5 scales_1.1.1 ggsci_2.9 UpSetR_1.4.0 ggrepel_0.9.1
[7] ggpubr_0.4.0 forestplot_2.0.1 checkmate_2.0.0 magrittr_2.0.1 pheatmap_1.0.12 devtools_2.4.3
[13] usethis_2.1.5 BlandAltmanLeh_0.3.1 tableone_0.13.0 haven_2.4.3 eeptools_1.2.4 DT_0.20
[19] knitr_1.37 forcats_0.5.1 stringr_1.4.0 purrr_0.3.4 tibble_3.1.6 ggplot2_3.3.5
[25] tidyverse_1.3.1 data.table_1.14.2 naniar_0.6.1 tidyr_1.1.4 dplyr_1.0.7 optparse_1.7.1
[31] readr_2.1.1
loaded via a namespace (and not attached):
[1] minqa_1.2.4 colorspace_2.0-2 ggsignif_0.6.3 ellipsis_0.3.2 visdat_0.5.3 rprojroot_2.0.2 fs_1.5.2
[8] rstudioapi_0.13 farver_2.1.0 remotes_2.4.2 getopt_1.20.3 fansi_1.0.0 lubridate_1.8.0 xml2_1.3.3
[15] splines_4.1.2 cachem_1.0.6 pkgload_1.2.4 jsonlite_1.7.2 nloptr_1.2.2.3 broom_0.7.11 dbplyr_2.1.1
[22] compiler_4.1.2 httr_1.4.2 backports_1.4.1 assertthat_0.2.1 Matrix_1.4-0 fastmap_1.1.0 survey_4.1-1
[29] cli_3.1.0 htmltools_0.5.2 prettyunits_1.1.1 coda_0.19-4 gtable_0.3.0 glue_1.6.0 reshape2_1.4.4
[36] Rcpp_1.0.7 carData_3.0-5 cellranger_1.1.0 jquerylib_0.1.4 vctrs_0.3.8 nlme_3.1-153 crosstalk_1.2.0
[43] lmtest_0.9-39 xfun_0.29 ps_1.6.0 testthat_3.1.1 lme4_1.1-27.1 rvest_1.0.2 lifecycle_1.0.1
[50] rstatix_0.7.0 MASS_7.3-54 zoo_1.8-9 ragg_1.2.1 hms_1.1.1 RColorBrewer_1.1-2 curl_4.3.2
[57] yaml_2.2.1 gridExtra_2.3 memoise_2.0.1 pander_0.6.4 sass_0.4.0 stringi_1.7.6 maptools_1.1-2
[64] desc_1.4.0 zip_2.2.0 boot_1.3-28 pkgbuild_1.3.1 systemfonts_1.0.3 rlang_0.4.12 pkgconfig_2.0.3
[71] arm_1.12-2 evaluate_0.14 lattice_0.20-45 labeling_0.4.2 htmlwidgets_1.5.4 cowplot_1.1.1 processx_3.5.2
[78] tidyselect_1.1.1 plyr_1.8.6 R6_2.5.1 generics_0.1.1 DBI_1.1.2 pillar_1.6.4 foreign_0.8-81
[85] withr_2.4.3 survival_3.2-13 abind_1.4-5 sp_1.4-6 car_3.0-12 modelr_0.1.8 crayon_1.4.2
[92] utf8_1.2.2 tzdb_0.2.0 rmarkdown_2.11 readxl_1.3.1 callr_3.7.0 vcd_1.4-9 reprex_2.0.1
[99] digest_0.6.29 textshaping_0.3.6 munsell_0.5.0 bslib_0.3.1 mitools_2.4 sessioninfo_1.2.2
save.image(paste0(PROJECT_loc, "/",Today,".",PROJECTNAME,".RegionalAssociationPlots.RData"))| © 1979-2022 Sander W. van der Laan | s.w.vanderlaan[at]gmail.com | swvanderlaan.github.io. |